Informativeness of genetic markers for inference of ancestry.
Abstract
Inference of individual ancestry is useful in various applications, such as admixture mapping and structured-association mapping. Using information-theoretic principles, we introduce a general measure, the informativeness for assignment (I_n), applicable to any number of potential source populations, for determining the amount of information that multiallelic markers provide about individual ancestry. In a worldwide human microsatellite data set, we identify markers of highest informativeness for inference of regional ancestry and for inference of population ancestry within regions; these markers, which are listed in online-only tables in our article, can be useful both in testing for and in controlling the influence of ancestry on case-control genetic association studies. Markers that are informative in one collection of source populations are generally informative in others. Informativeness of random dinucleotides, the most informative class of microsatellites, is five to eight times that of random single-nucleotide polymorphisms (SNPs), but 2%-12% of SNPs have higher informativeness than the median for dinucleotides. Our results can aid in decisions about the type, quantity, and specific choice of markers for use in studies of ancestry.
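The abstract names the informativeness for assignment but does not give its formula. As a rough illustration only, the sketch below assumes the mutual-information form in which, for K equally weighted source populations, each allele j contributes -p_j log p_j + (1/K) Σ_i p_ij log p_ij, where p_ij is the frequency of allele j in population i and p_j is its average over populations. The function name, the input layout, and the use of natural logarithms are assumptions made for this example, not details taken from the text above.

```python
import math


def informativeness_for_assignment(freqs):
    """Sketch of an informativeness-for-assignment calculation for one marker.

    freqs: list of K lists (one per source population), each holding the
    frequencies of the same N alleles. Assumes equal weight per population
    and natural logarithms (both are assumptions of this example).
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        # Average frequency of allele j across the K populations.
        p_j = sum(pop[j] for pop in freqs) / k
        if p_j > 0:
            total -= p_j * math.log(p_j)
        # Population-specific terms; 0 * log(0) is treated as 0.
        for pop in freqs:
            if pop[j] > 0:
                total += (pop[j] / k) * math.log(pop[j])
    return total


# Identical allele frequencies carry no ancestry information.
same = informativeness_for_assignment([[0.5, 0.5], [0.5, 0.5]])
# Fixed differences between two populations give the maximum, log(K).
fixed = informativeness_for_assignment([[1.0, 0.0], [0.0, 1.0]])
```

Under these assumptions the value ranges from 0, when all populations share the same allele frequencies, to log K, when no allele is found in more than one population.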
Related articles
Combining markers into haplotypes can improve population structure inference.
High-throughput genotyping and sequencing technologies can generate dense sets of genetic markers for large numbers of individuals. For most species, these data will contain many markers in linkage disequilibrium (LD). To utilize such data for population structure inference, we investigate the use of haplotypes constructed by combining the alleles at single-nucleotide polymorphisms (SNPs). We i...
Inferring ancestry from population genomic data and its applications
Ancestry inference is a frequently encountered problem and has many applications such as forensic analyses, genetic association studies, and personal genomics. The main goal of ancestry inference is to identify an individual's population of origin based on our knowledge of natural populations. Because both self-reported ancestry in humans or the sampling location of an organism can be inaccurat...
Individual Identifiability Predicts Population Identifiability in Forensic Microsatellite Markers
Highly polymorphic genetic markers with significant potential for distinguishing individual identity are used as a standard tool in forensic testing [1, 2]. At the same time, population-genetic studies have suggested that genetically diverse markers with high individual identifiability also confer information about genetic ancestry [3-6]. The dual influence of polymorphism levels on ancestry in...
Towards an Information-Theoretic Approach to Population Structure
This paper uses an information-theoretic perspective to propose multi-locus informativeness measures for ancestry inference. These measures describe the potential for correct classification of unknown individuals to their source populations, given genetic data on population structure. Motivated by Shannon's axiomatic approach in deriving a unique information measure for communication (Shannon 1...
Multi-InDel Analysis for Ancestry Inference of Sub-Populations in China
Ancestry inference is of great interest in diverse areas of scientific researches, including the forensic biology, medical genetics and anthropology. Various methods have been published for distinguishing populations. However, few reports refer to sub-populations (like ethnic groups) within Asian populations for the limitation of markers. Several InDel loci located very tightly in physical posi...
Journal:
- American journal of human genetics
Volume 73, Issue 6
Pages -
Publication date: 2003